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Abstract 

Background: Despite its high number of endemic deciduous broad-leaved species in China's warm-temperate 
zone, far less attention has been paid to phylogeographic studies in this region. In this work, the phylogeographic 
history of Forsythia suspensa endemic to China's warm-temperate zone was investigated to explore the effect of 
climate change during the Pleistocene on the distribution of this deciduous broad-leaved species in China. 

Results: The cpDNA data revealed seven phylogeographical groups corresponding to geographical regions. By 
contrast, the nrDNA data supported the samples clustered into three groups, which was inconsistent with separate 
geographical regions supported by cpDNA data. Ecological niche modeling showed that the climatically suitable 
area during the cold period was larger than that during the warm period. 

Conclusions: Both molecular data and ecological niche modeling indicated that F suspensa expanded to nearby 
low-elevation plains in the glacial periods, and retreated to mountaintops during interglacial warmer stages. This study 
thus supported that F suspensa persisted in situ during the glacial of the Pleistocene with enlarged distribution area, 
contrary to the hypothesis of long distance southward migration or large-scale range contraction. 

Keywords: cpDNA, nrDNA, Ecological niche model, China's warm-temperate zone, Molecular phylogeography, 
Forsythia suspensa 



Background 

Profound climatic oscillations during the Pleistocene 
resulted in repeated drastic environmental changes, which 
also substantially shaped the species' distribution, evolution, 
and extinction [1-4]. The Last Glacial Maximum (LGM) 
occurred at around 23 000-18 000 years before present, 
which was particularly considered an important factor in- 
fluencing the current plant distribution [4,5]. Numerous 
molecular-based phylogeographical surveys in Europe and 
North America have been extensively studied to unravel 
the locations of refugia, the potential recolonization routes 
and the subsequent evolution and speciation during glacial 
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and post-glacial epochs [6-8]. In contrast to Europe and 
North America, most parts of China were not covered by 
massive ice sheet during LGM. It was clear that ice sheet 
had only developed in certain areas of the Qinghai- Tibetan 
Plateau [9]. In recent years, numerous phylogeographical 
surveys of plant species in China have been informative in 
resolving the location of glacial refugia and routes of 
colonization/range expansion after glacial periods. 

Three hypotheses on the occurrence of refugia and 
postglacial expansion in China have been proposed. The 
first hypothesis is that climatic changes during Pleistocene 
deeply impacted the forests in China, and large-scale vege- 
tation experienced long-distance southward migration 
during glaciations. Thus, the species subsequently colo- 
nized northward from the southern refugia after glaciation 



© 2014 Fu et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative 
Commons Attribution License (http://creativecommons.Org/licenses/by/2.0), which permits unrestricted use, distribution, and 
reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain 
Dedication waiver (http://creativecommons.Org/publicdomain/zero/1.0/) applies to the data made available in this article, 
unless otherwise stated. 



Fu et a I. BMC Evolutionary Biology 2014, 14:1 14 
http://www.biomedcentral.eom/1 471 -21 48/1 4/1 1 4 



Page 2 of 13 



[10-12]. The second hypothesis is that climatic changes 
during Pleistocene had a large influence on the forests in 
China; however, no large-scale vegetation experienced 
long-distance southward migration during glaciation and 
instead contracted into a few main refugia, as suggested 
by most phylogeographical studies in China [13,14]. The 
third hypothesis is that some species were slightly affected 
by Pleistocene glacial cycles that persisted in situ through- 
out the LGM and occupied multiple localized glacial refu- 
gia, as suggested by some phylogeographical studies in 
this region on species such as Taxus wallichiana [15], 
Cathay a argyrophylla [16], and Eurycorymbus cavaleriei 
[17]. Some species reportedly occupied a much larger area 
than today, e.g., Alsophila spinulosa [18], Pinus kwangtun- 
gensis [19], and Primula obconica [20]. Nevertheless, these 
studies have mainly focused on the plant species in the 
regions with high biodiversity, such as Qinghai-Tibetan 
Plateau, the Himalaya- Hengduan Mountains, and sub- 
tropical China, and only a few studies are about Chinas 
warm-temperate zone. However, this region has the 
highest population density in the country and has devel- 
oped agriculture. Dramatic climate change is adversely 
affecting the global ecological system and agricultural 
cultivation. Therefore, more comprehensive studies on 
species with wide distribution ranges in Chinas warm- 
temperate zone are needed to test the impact on this region 
during the Pleistocene glacial cycle. 

"Chinas warm-temperate zone" generally refers to the 
area between 32°30'-42°30' and 103°30'-124°10\ In this 
region, the typical vegetation is deciduous broad-leaved 
forest (DBLF) [21]. Since the Tertiary Period, the warm- 
temperate zone in China has not been strongly influenced 
by massive Quaternary glaciers. In contrast to the extinc- 
tion of a large number of broad-leaved species in Europe 
and North America, the majority of deciduous broad-leaved 
species is preserved in this region [22]. Although they 
have survived the Quaternary glacial period, the evolution 
and distribution of the warm-temperate DBLF were still 
affected by climate fluctuations. However, our knowledge 
on phylogeographical histories of organisms occurring in 
Chinas warm-temperate zone and their correlations 
with climatic fluctuations has been limited due to finite 
phylogeographical studies in this region, particularly 
for plants [10,23-25]. Thus, increased attention must be 
paid to Chinas warm-temperate zone and to the influ- 
ences of Quaternary climate change in this area based on 
large-scale genetic studies with extensive sampling. 

Forsythia suspensa (Thunb.) Vahl (Oleaceae) is a decidu- 
ous shrub widely distributed in Chinas warm-temperate 
zone with elevations of 300 m to 2200 m above sea level. 
As a typical component of DBLF, F. suspensa exists in most 
distribution areas of current DBLF in China. Consequently, 
we selected F. suspensa as a model for inferring phylo- 
geographical patterns in Chinas warm-temperate zone 



to understand DBLF population dynamics in response 
to climate change. 

In this study, two chloroplast DNA (cpDNA) regions, 
one nuclear ribosomal DNA (nrDNA) region, and eco- 
logical niche models were used to examine the phylogeo- 
graphical pattern of F. suspensa. Our specific objectives 
were to address the following questions: (i) what is the 
genetic structure of F. suspensa populations in China as 
revealed by cpDNA and nrDNA data; and (ii) how did the 
species response to the climatic oscillations during the 
Pleistocene. 

Results 

cpDNA diversity and population structure 

Out of the two cpDNA regions sequenced in F. suspensa 
(182 individuals, 20 populations), the two regions both 
showed length variation (psbA-trnH, 370 bp to 397 bp; 
£r«L-F, 782 bp to 783 bp). When combined, these se- 
quences (1153 bp to 1180 bp) were aligned with a con- 
sensus length of 1180 bp, and contained 16 nucleotide 
substitutions and 3 indels. Based on these polymorphisms, 
13 chlorotypes (CI to C13) were identified among all 
samples surveyed (Additional file 1). The 11 psbA-trnH 
and 7 trnL-F chlorotype sequences were deposited in the 
GenBank database under accession numbers KF366077 to 
KF366094. Among the 13 chlorotypes detected, the most 
widespread chlorotypes were CI (in 4 of 20 populations) 
and C2 (in 5 of 20 populations). The geographical distri- 
bution of chlorotypes CI to C13 and their occurrence at 
each locality are shown in Figure IB and Table 1. Only 3 
of the 20 populations were polymorphic (PI 2, PI 8, and 
P19), whereas the other populations exhibited only one 
chlorotype. The statistical parsimony network of chloro- 
types CI to CI 3 revealed that they were only one to seven 
mutational steps apart (Figure 1C). The haplotype and 
nucleotide diversities of cpDNA were h T = 0.600 and 
jt t = 1.58 x 10" 3 , respectively. The highest nucleotide 
and haplotype diversities were found in PI 8 and PI 9, 
respectively (Table 1). Seven private chlorotypes of 
populations were found in the species, with the highest 
in P18. 

Bayesian analysis of population structure (Figure 2) re- 
vealed that the highest likelihood of cpDNA data was 
obtained when samples were clustered into seven groups 
(K= 7; data not shown). The seven groups identified were 
as follows: Shandong (PI to P4), Henan-Shaanxi-Shanxi 
(HSS) (P5 to P8), Henan-Hubei-Shanxi (HHS) (P9 to 
P16), Song Mt. (P17), Wulaofeng (P18), Baota Mt. (P19), 
and Wuzhi Mt. (P20) groups. Non-hierarchical AMOVA 
(Table 2) revealed a strong population genetic structure 
for cpDNA sequence variation at the species-range scale 
(0 S t = 0.929; P < 0.001). However, most of this differen- 
tiation was partitioned among the seven groups (Oct = 
0.793), and only 14.83% was explained among populations 
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Figure 1 Geographic distribution and genealogical relationships of cpDNA haplotypes recovered from F. suspensa populations in 

China. Map was downloaded from http://www.geoatlas.com. A, Distribution ranges off. suspensa (red lines) in China. B, The geographic 

distribution of thirteen chlorotypes (CI -CI 3) detected (for population numbers see Table 1). C, The statistical parsimony network of chlorotypes 

CI -CI 3. The area of circles corresponds to the frequency of each haplotype. Each solid line represents one mutational step interconnecting two 

haplotypes for which parsimony is supported at the 95% level. 
\ J 



within each region (&sc = 0.715) (Table 2). Significant 
isolation by geographical distance for cpDNA was de- 
tected at the species-range scale (r = 0.343; P< 0.001). 
However, the isolation by ecological distance for cpDNA 
was not significant (r = -0.113; P > 0.05). 



Tajimas D and Fus Fs statistics for deviation from neu- 
trality were examined for the four groups including HSS, 
HHS, Wulaofeng and Baota Mt, and no significant values 
were detected. The other three groups, including Shandong, 
Song Mt. and Wuzhi Mt., were not calculated because each 
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Table 1 Details of population locations, sample size, cpDNA variation of F. suspensa sampled in China 



Population no. and code 


Locations 


Lat.(N)/ Long.(E) 


N 


Chlorotypes (nos. of individuals) 


nxlO" 3 


h 


N\ 


Shandong group 
















1 . SDBD 


Baodugu, Shandong 


35°00V1 1 7°42' 


10 


C11 (10) 


0 


0 


0 


2. SDMM 


Meng Mt., Shandong 


35°30V1 1 7°48' 


10 


C11 (10) 


0 


0 


0 


3. SDTM 


Tai Mt., Shandong 


36°157117°06' 


10 


C11 (10) 


0 


0 


0 


4. SDYM 


Yuan Mt., Shandong 


36 0 28V117°5V 


10 


C11 (10) 


0 


0 


0 


HSS group 
















5. HNJG 


Jigong Mt, Henan 


31°507114°05' 


10 


C12 (10) 


0 


0 


1 


6. SXU 


Laojun Mt., Shaanxi 


34°207110°15' 


10 


C5 (10) 


0 


0 


0 


7. SXHM 


Hua Mt., Shaanxi 


35°33'/110°06' 


10 


C5 (10) 


0 


0 


0 


8. SXWT 


Wutai Mt., Shanxi 


39°00V113°35' 


8 


C13 (8) 


0 


0 


1 


HHS group 
















9. HBDH 


Dahong Mt., Hubei 


31°317112°58' 


6 


C1 (6) 


0 


0 


0 


10. HNTB 


Tongbai Mt., Henan 


32°237112°50' 


10 


C1 (10) 


0 


0 


0 


11. HBWD 


Wudang Mt., Hubei 


32°24'/1 1 TOO' 


10 


C2 (10) 


0 


0 


0 


12. HNLY 


Longyuwan, Henan 


33°427111°45' 


10 


C2 (8), C3(1), C8 (1) 


0.81 


0.378 


0 


13. HNU 


Laojieling, Henan 


33°457111°20' 


10 


C2 (10) 


0 


0 


0 


14. HNJL 


Jiulian Mt., Henan 


35°357113°35' 


10 


C3 (10) 


0 


0 


0 


15. SXLK 


Lingkong Mt., Shanxi 


36°367112°05' 


10 


C2 (10) 


0 


0 


0 


16. SXTL 


Tianlong Mt., Shanxi 


37°427112°26' 


6 


C2 (6) 


0 


0 


0 


Song Mt. group 
















17. HNSM 


Song Mt., Henan 


34°287113°05' 


10 


C6 (10) 


0 


0 


1 


Wuloofeng group 
















18. SXWL 


Wulaofeng, Shanxi 


34°507110°35' 


10 


C7 (7), C10 (3) 


1.58 


0.467 


2 


Baota Mt. group 
















19. SXBT 


Baota Mt., Shaanxi 


36°357109°29' 


6 


C8 (3), C9 (3) 


1.53 


0.600 


1 


Wuzhi Mt. group 
















20. HBWZ 


Wuzhi Mt., Hebei 


36°307113°39' 


10 


C4(10) 


0 


0 


1 


Total 






186 




2.53 


0.867 


7 



tt: nucleotide diversity, h: haplotype diversity, and A/pc: number of private chlorotypes. 




Figure 2 Estimated genetic structure for K=7 obtained with the program STRUCTURE (Pritchard et al., 2000 [54]) for 20 populations of 
F. suspensa based on cpDNA data. Each vertical bar represents an individual and its assignment proportion (not shown) into one of seven 
(colored) population clusters (K). 

V J 
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Table 2 Non-hierarchical and hierarchical AMOVAs for cpDNA and nrDNA variation surveyed in populations of F. 
suspensa in China 



Source of variation 

cpDNA 

Non-hierarchical AMOVAs 
Total 

Shandong group 
HSS group 
HHS group 
Hierarchical AMOVAs 
Among seven groups 
Among populations 
Within populations 
nrDNA 

Non-hierarchical AMOVAs 
Total 

Shandong group 
HSS group 
HHS group 
Hierarchical AMOVAs 
Among seven groups 
Among populations 
Within populations 



d.f. 



19 

3 
3 
7 

6 

13 

166 



13 

352 



% Total variance 



92.94% 
NC 

100.00% 
83.05% 

79.27% 
14.83% 
5.90% 



20.66% 
2.86% 
37.36% 
14.67% 

-4.93% 
24.79% 
80.15% 



(P-statistic 



®st = 0.929 
NC 

®st= 1-000 
®st = 0.831 

(P CT = 0.793 
(Psc = 0.715 
®st = 0.941 



P ST = 0.207 
D SJ = 0.029 
Pst = 0.374 
D ST = 0.147 

P CT = -0.049 
Z) sc = 0.236 
Pst = 0.199 



P-value 



< 0.001 
NC 

< 0.001 

< 0.001 

< 0.001 

< 0.001 

< 0.001 



< 0.001 

> 0.05 

< 0.001 

< 0.001 

> 0.05 

< 0.001 

< 0.001 



CD: Fixation index. 

of them had a single chlorotype (Table 3). Similarly, 
mismatch analysis was also not performed for these 
three groups. Unlike the two neutrality tests, the HSS 
and Baota Mt groups showed demographic expansions 
for nonsignificant SSD and RAG values (Table 3). Accord- 
ing to the result of molecular clock calibration, the mean 
divergence time of nodes ranged from 270 Ma (node A) 
to 0.35 Ma (node L) or 1.90 Ma (node A) to 0.25 Ma 
(node L) (Figure 3), assuming, respectively, minimum and 
maximum rates of synonymous substitution in cpDNA 
[26]. This suggests that the divergence of the species fell 
into the Early-to-Late Pleistocene. 



nrDNA diversity and population structure 

One nrDNA (ITS) region was also sequenced in F. 
suspensa from 186 individuals (20 populations). The 
aligned sequences were 481 bp and contained 63 nu- 
cleotide substitutions. Based on these polymorphisms, 
74 ribotypes (Rl to R74) were identified among all 
samples surveyed (Additional file 2). The sequences of the 
74 ribotypes were deposited in the GenBank database 
under accession numbers KF366003 to KF366076. Among 
the 74 ribotypes detected, the most widespread haplotype 
was R2 (in 19 out of 20 populations). The geographical 
distribution of ribotypes Rl to R74 and their occurrence 



Table 3 Results of demographic analyses based on two datasets for seven groups and all samples of F. suspensa 







cpDNA 








nrDNA 






Groups 


SSD (P value) 


RAG (P value) 


Tajima's D 


Fu's FS 


SSD (P value) 


RAG (P value) 


Tajima's D 


Fu's FS 


Shandong group 






0 




0.007 (0.809) 


0.026 (0.852) 


-1.654* 


-14.601** 


HSS group 


0.003 (0.519) 


0.082 (0.462) 


1.518 


1.418 


0.022 (0.712) 


0.041 (0.822) 


-0.772 


-3.717 


HHS group 


0.018 (0.041) 


0.162 (0.021) 


-0.691 


0.279 


0.327 (0.000) 


0.151 (1.000) 


-1.769* 


-25.668** 


Song Mt. group 






0 




0.047 (0.259) 


0.123 (0.389) 


-0.153 


-1.941 


Wulaofeng group 


0.436 (0.000) 


0.720 (0.942) 


1.229 


4.487 


0.279 (0.000) 


0.466 (0.937) 


0.139 


0.955 


Baota Mt. group 


0.303 (0.069) 


0.880 (0.069) 


1.910 


2.759 


0.077 (0.217) 


0.126 (0.433) 


-0.665 


-0.244 


Wuzhi Mt. group 






0 




0.049 (0.020) 


0.724 (0.576) 


-1.440 


-0.674 



'*', P values, 0.05; '**', P values, 0.01; SSD, sum-of-squared deviations; Rag, Raggedness index. 
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Figure 3 Neighbor-joining (NJ) clustering of thirteen chlorotype sequences of F. suspensa. F. viridissima was used as an outgroup. The 
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at each locality are shown in Table 4. Nineteen of the 
twenty populations were polymorphic, and only P9 exhib- 
ited one ribotype. The haplotype and nucleotide diversities 
based on nrDNA data for F. suspensa were h T = 0.625 and 
tt t = 5.65 x 10" 3 , respectively. Nucleotide diversity (jt) 
among the 20 populations ranged from 0 to 23.25 x 10~ 3 
and haplotype diversity (h) varied between 0 and 0.868. 
The highest nucleotide diversity and haplotype diversity 
was found in populations P8 and P3, respectively (Table 4). 
A total of 61 private ribotypes of populations existed in 
the species, and the highest number of private ribotype 
was found in Pll. 

Bayesian analysis based on nrDNA revealed a progressive 
increase in L (K) until K=3 (data not shown). However, 
the resulting groupings did not correspond to separate 
geographical regions supported by cpDNA data (Figure 4). 
Non-hierarchical AMOVA and Nei s estimator of popula- 
tion substructure (G S t) indicated high levels of population 
differentiation for nrDNA in F. suspensa at the species- 
range scale (0 S t = 0.207; Table 2). Despite taking into ac- 
count the species' weak hierarchical (regional) substructure 
(0 CT = -0.049), overall levels of population divergence still 
remained high (0 S c = 0.236; P< 0.001; Table 2). However, 
no significant isolation by geographical distance (r = -0.019, 
P > 0.05) and isolation by ecological distance (r = -0.046, 
P > 0.05) were found for nrDNA. Using an nrDNA-derived 
^st (n) value of 0.207 across all surveyed populations (see 
above) and the corresponding value of cpDNA (0 ST (c) = 
0.929), the pollen/seed migration ratio (r) was calculated 
as 48.4, indicating that pollen flow was significantly higher 
than seed flow. 

Tajimas D and Fus Fs statistics based on nrDNA data 
were examined for each of the seven groups. Shandong 
and HHS groups significantly deviated from neutrality. 



However, mismatch analysis did not support the demo- 
graphic expansion of HHS group (Table 3). 

Climatic suitability inference by ecological niche modeling 

The predicted areas climatically suitable for F. suspensa 
for the current and past (LGM) are illustrated in Figure 5. 
The values of AUC based on both training and test pres- 
ence data for the present were all higher than expected by 
chance (0.995 and 0.998), demonstrating good model per- 
formance. Notably, the model indicated that F. suspensa 
experienced habitat fragmentation isolated by intervening 
unsuitable habitat, and significant isolation was found be- 
tween the Shandong group and other groups (Figure 5A). 
The area of climatically suitability value above 0.4 from 
the LGM prediction (430,602 km 2 ) was larger than that 
from the current prediction (311,860 km 2 ). The area of cli- 
matically suitability value above 0.8 at LGM (70,032 km 2 ) 
was also slightly larger than the current most suitable 
habitat (67,329 km 2 ). Obviously climatically suitable areas 
expanded in the Shandong group, retreated in the northern 
edge (Hebei and Shandong provinces) and extended 
southward to the edge of subtropical Chongqing and 
Hubei provinces during the LGM (see Figure 5B). 

Discussion 

Genetic diversity and spatial population genetic structure 

Generally, geographical distribution, breeding system, 
and population size all affect genetic diversity in plant 
species [27-29] . F. suspensa is an out-crossing species that 
has two mechanisms for avoiding autogamy, herkogamy 
and dichogamy [30]. F. suspensa is also a widespread 
species with a large population size. All these features 
should result in a relatively high genetic diversity of F. 
suspensa. However, the species-wide level of genetic 
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Table 4 nrDNA variation of F. suspensa sampled in China 

Population no. and code N Ribotypes (nos. of individuals) 

Shandong group 

I . SDBD 

2. SDMM 

3. SDTM 

4. SDYM 
HSS group 

5. HNJG 

6. SXU 

7. SXHM 

8. SXWT 
HHS group 

9. HBDH 

10. HNTB 

II. HBWD 

12. HNLY 

13. HNU 

14. HNJL 

15. SXLK 

16. SXTL 
Song Mt. group 

17. HNSM 
Wuloofeng group 

18. SXWL 
Baota Mt. group 

19. SXBT 
Wuzhi Mt. group 

20. HBWZ 
Total 

n: nucleotide diversity, h: haplotype diversity, and A/pr: number of private ribotypes. 



nxlO" 3 h A/pr 



0.684 1 

0.742 3 

0.868 7 

0.653 7 

0.442 0 

0.789 6 

0.805 6 

0.642 3 

0 0 

0.516 2 

0.758 8 

0.442 2 

0.574 3 

0.353 1 

0.368 2 

0.864 3 

0.416 0 

0.742 3 

0.682 4 

0.195 0 

0.625 61 



10 R1 (1), R2 (11), R6 (2), R13 (1), R15 (2), R16 (3) 2.80 

10 R2 (10), R6 (2), R20 (2), R40 (1) R41 (1), R42 (2), R43 (2) 6.47 

10 R2 (7), R16 (2), R26 (1), R40 (1) R47 (1), R48 (1), R49 (3), R50 (1) R51 (1), R52 (1), R53 (1) 8.09 

10 R1 (1), R2 (12), R68 (1), R69 (1), R70 (1), R71 (1), R72 (1), R73 (1) R74 (1) 5.25 

10 R1 (6), R2 (14) 0.92 

10 R1 (1), R2 (9), R29 (1), R30 (1), R31 (3), R32 (2), R33 (1), R34 (1) R35 (1) 3.98 

10 R2 (9), R7 (1), R13 (1), R20 (1), R22 (1), R23 (1), R24 (1), R25 (1) R26 (2), R27 (1), R28 (1) 6.11 

8 R2 (6), R65 (1), R66 (8), R67 (1) 23.25 

6 R2 (12) 0 

10 R1 (2), R2 (14), R3 (1), R4 (1), R5 (1) , R6 (1) 2.05 

10 R2 (10), R37 (1), R54 (1), R55 (1) R56 (1), R57 (1), R58 (1), R59 (1) R60 (2), R61 (1) 7.47 

10 R2 (15), R6 (2), R7 (1), R8 (1), R9 (1) 1.44 

10 R2 (12), R10 (6), R11 (1), R12 (1) 3.38 

10 R2 (16), R5 (3), R14 (1) 1.32 

10 R2 (16), R36 (1), R37 (1), R38 (1) R39 (1) 2.76 

6 R2 (4), R5 (1), R13 (3), R36 (1), R44 (1), R45 (1), R46 (1) 4.76 

10 R2 (15), R5 (4), R13 (1) 1.85 

10 R1 (10), R13 (1), R26 (3) , R34 (1) R37 (2), R62 (1), R63 (1), R64 (1) 4.48 

6 R2 (7), R17 (1), R18 (1), R19 (1) R20 (1), R21 (1) 6.36 

10 R2 (18), R5 (1), R40 (1) 0.81 

186 5.65 



diversity (h T = 0.625) was unexpectedly lower than that of 
other seed plants in China based on the ITS, such as Erio- 
phyton wallichii (h T = 0.908) [31], Achyranthes bidentata 
(h T = 0.703) [32], and Primula obconica (h T = 0.994) [20]. 
The species-wide level of cpDNA-derived genetic diversity 
(h T = 0.867) was moderate compared with 13 other seed 



plants used as maternally inherited markers in China [13]. 
Three possible factors contributed to the observed 
relatively low level of genetic diversity. First, the current 
climatically suitable areas predicted by the ENM suggested 
that F. suspensa may have experienced habitat fragmenta- 
tion, which was probably the key contributory factor 




Figure 4 Estimated genetic structure for K=3 obtained with the program STRUCTURE (Pritchard et al., 2000 [54]) for 20 populations of 
F. Suspensa based on nrDNA variation. Each vertical bar represents an individual and its assignment proportion (not shown) into one of three 
(colored) population clusters (K). 
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resulted in the loss of genetic diversity. Second, the low 
genetic diversity may be caused by increased human activity 
because this region had become a developed agricultural 
area since the Neolithic Period. Third, as a main Chinese 
traditional drug, over-harvesting of its fruit may have 
caused a sharp decline in the quantity of plants and may 
have also accelerated the loss of genetic diversity. 

In addition, cpDNA data demonstrated significant 
population differentiation within F. suspensa. Population 
subdivision based on cpDNA data (0 S t = 0.929) was much 
greater than the average level of other seed plants for 
maternally inherited markers (mean 0 ST = 0.670) [33]. 
Another striking feature of F. suspensa was the marked 
group differentiation based on cpDNA (<D CT = 0.793). 
Significant IBD pattern was indicated by cpDNA (r = 0.343; 
P < 0.001) analyses, suggesting that gene flow declined with 
increased geographical distance. Generally, gene flow esti- 
mations using plastid DNA markers are based on DNA 
transmission through seeds in most angiosperm species 
[34]. However, F. suspensa lacks efficient seed dispersal 
mechanisms, and seeds of F. suspensa are small with 
wing-like structures. The seeds are dispersed by wind, the 
dispersal distance is short, and most seeds are spread in 



the same population, which may be one of the main 
reasons for the high levels of genetic differentiation. In 
addition, environmental conditions in mountains at 
higher elevations markedly differed from those in the 
intervening plains, which probably acted as barriers to 
gene flow through seeds and further resulted in high 
levels of interpopulation genetic divergence. However, 
due to lack of long distance dispersal mechanism for F. 
suspensa, the significant genetic differentiation among 
populations didn't be caused by the ecological distance. 
Meanwhile, considering the cultivated land area in low- 
elevation plains, the DBLFs in the warm-temperate zone 
of China were fragmented, which led to patch-like habitats 
of F. suspensa. 

In contrast to the significant phylogeographical structure 
obtained with cpDNA, only a moderate level of genetic 
differentiation and phylogeographical structure was sug- 
gested by nrDNA when examined over all populations 
{&st = 0.207). Limited seed flow and high pollen flow 
among populations was considered to be the explanation 
for the nrDNA-derived population differentiation in 
F. suspensa. In fact, the pollen-to-seed migration ratio 
(r) obtained for F. suspensa (r = 48.4) was obviously higher 
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than the corresponding average value reported for seed 
plant species (median r ~ 17 estimated over 93 species) 
[33,35]. The spatial pattern of maternally inherited 
cpDNA differentiation has indicated the presence of 
obvious genetic structure; gene flow through long-distance 
dispersed pollen usually erodes the genetic signature of iso- 
lation by distance. Thus, our results represented another 
example of this situation and provided evidence of efficient 
pollen-mediated gene flow among the isolated populations. 
Considering the effects of such long-distance, pollen- 
mediated gene flow, forest fragmentation and habitat 
isolation among populations may not have played an 
important role in nuclear genomic diversification and 
speciation. However, analysis of isolation by geographical 
and ecological distance based on nrDNA data indicated 
geographical or ecological factor was not related to 
pollen-mediated gene flow, which may be due to the 
mutual interference among different gene pools (i.e., 
the seven groups identified by cpDNA). 

Inference of phylogeographical history in F. suspensa 

Climatic changes during Pleistocene glacial cycles are 
believed to have affected the present distribution pattern 
and phylogeographical structure of plant species [4,36,37] . 
However, the role of these climatic fluctuations in Chinas 
warm-temperate zone and their potential importance in 
the current population genetic structuring of regional 
vegetation is not well understood. In order to investigate 
the effection by climatic changes during Pleistocene gla- 
cial cycles, we selected F. suspensa as a model to test the 
three hypotheses which described in the introduction. In 
this work, the partitioning of genetic variability-based 
cpDNA had a significant geographical component, each 
group had its own unique chlorotype, and no chlorotype 
was shared among groups. Printzen et al, reported large- 
scale intraspecific disjunctions in many species that can 
alternatively be explained by range fragmentation and 
widespread long-distance dispersal [38]. Given the lack of 
effective seed dispersal mechanism, widespread long- 
distance dispersal may not be the main reason for in- 
traspecific disjunctions [39]. Thus, the heterogeneous 
chlorotype composition and genetic structure may be 
ascribed to range fragmentation. Fragmentation may be a 
consequence of isolation that can either be geographical 
or environmental. However, no obvious geographical 
barrier was found among the seven groups. Therefore, 
F. suspensa did not appear to be geographically isolated, 
thereby allowing ecological niche modeling to be used in 
assessing species status. The model prediction for the 
current time indicated that F. suspensa may have expe- 
rienced habitat fragmentation isolated by intervening 
unsuitable habitat. This phenomenon was particularly 
prominent between the Shandong group and other groups 
(Figure 5A). Our divergence time analysis revealed that 



the divergence of the species most likely fell into the 
Early-to-Late Pleistocene, about 2.70 Ma to 0.25 Ma, 
coinciding with frequent climatic oscillations during 
the Pleistocene, which is considered as one of the most 
important periods for genetic diversification and speciation 
[3,40]. Molecular clock analysis of chlorotype variation 
agreed with phylogenetic analyses in indicating the diver- 
gence of species long predating the LGM. Thus, we pre- 
sumed that the subdivision of the seven groups resulted 
from allopatric fragmentation in the past. 

To further infer demographic processes, two markers 
were used to detect each group. The apparent population 
expansion (inferred from the climatically suitable area 
expansion) was found only in the Shandong group, which 
was supported by the mismatch distribution, as well as 
Tajimas D and Fus Fs, values based on nrDNA data 
(Table 3), was also supported by ecological niche mod- 
eling (Figure 5). However, population expansion in the 
Shandong group interestingly occurred not during warming 
stages but during glacial stages (Figure 5). Although ap- 
parent population expansion was not detected in most 
groups by demographic analyses, we still presumed that 
the other six groups (except for the Shandong group) ex- 
perienced limited population expansion for high number 
of low-frequency private ribotype in most populations 
(Table 4). Similarly, these limited population expansion 
were also occurred during glacial stages, this opinion was 
also supported by ENM (Figure 5). Taken together, a sce- 
nario of repeated expanding to nearby low altitude plains 
in the glacial periods (similar to LGM, Figure 5B), and by 
retreating to mountaintops during interglacial warmer 
stages (similar to current days, Figure 5A) is the most 
likely for F. suspensa during the Quaternary climatic 
changes. However, population expansion did not result 
in continuous glacial distribution, and no geographical 
contact existed between groups. This suggestion was 
supported by the heterogeneous chlorotype composition 
of groups. Unlike cpDNA data, nrDNA data supported 
populations were clustered into three groups, with each 
group including samples from two or more separate 
cpDNA geographical groups (Figure 4). The discordance 
between the patterns revealed by cpDNA and nrDNA data 
indicated more extensive pollen- mediated gene flow than 
seed-mediated gene flow among the separate geographical 
groups. 

By assuming F. suspensa agreed with the first hypothesis, 
we would expect to see a high diversity in southern popula- 
tions and impoverished intrapopulation genetic diversity 
in the direction of recolonization [3,4,41], However, our 
genetic analysis based on molecular dada did not support 
this view, i.e., the two southernmost populations P5 and 
P9 did not have a higher genetic diversity (Tables 1 and 4). 
Ecological niche modeling also suggested F. suspensa was 
not compressed and underwent long southward migration 
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during the LGM. Under the scenario that the second 
hypothesis was appropriate for F. suspensa, present-day 
populations in the re-colonized area would probably 
show near genetic uniformity for derived chlorotypes 
and significant genetic diversity decline from refugia to 
the recolonized area. However, our data suggested that 
only the Shandong group had uniform chlorotype, and the 
other groups (i.e., HHS and HHS) had heterogeneous 
chlorotype. In addition, no significant longitudinal or lati- 
tudinal gradient in nrDNA diversity was observed in the 
three groups (i.e., Shandong, HHS, and HHS); the other 
four groups were not calculated because only one popula- 
tion existed. Based on analysis of isolation by ecological 
distance for cpDNA (r= -0.113; P>0.05), which further 
indicated no long distance dispersal occurred for this 
species. Thus, most aspects of the herein characterized 
phylogeographical structure of F. suspensa agreed with 
the evolutionary model of interglacial compression and 
glacial-limited expansion. Although F. suspensa obviously 
retreated at the northern edge, no evidence was found 
for the long-distance southward migration or large-scale 
range contraction during glaciation. 

Conclusions 

Analysis of molecular data and ecological niche modeling 
suggested that F. suspensa tracked Quaternary climatic 
changes by expanding to nearby low-elevation plains in 
the glacial periods and by retreating to mountaintops 
during interglacial warmer stages, thereby experiencing 
fragmentation and isolation. No geographical contact zone 
existed among groups during the Pleistocene glacial cycles, 
and extensive pollen- mediated gene flow weakened the 
genetic divergence among separate geographical groups. 

Methods 

Population sampling 

Silica-dried samples of leaf material were obtained from 
20 populations, representing almost the entire natural 
distribution area of F. suspensa. The distance between 
samples within populations was at least 10 m to increase 
the likelihood of sampling inter-individual variation within 
each population. Geographical information regarding these 
populations and numbers of individuals used in the cpDNA 
and nrDNA analyses are presented in Table 1. 

DNA isolation, polymerase chain reaction (PCR) 
amplification, and DNA sequencing 

Total DNA was extracted from approximately 30 mg 
of dried leaf tissue using a Plant Genomic DNA Kit 
(TIANGEN, Beijing, China) according to the manufac- 
turers protocol. For phylogeographical DNA surveys, we 
sequenced two regions of cpDNA (trriL-F [42], psbA-trnH 
[43]), including the internal transcribed spacer region 
(ITS) of nrDNA (ITS4-ITS5 [44]). PCR was performed in 



a reaction volume of 30 \iL containing 30 ng of genomic 
DNA, 0.2 mmol/L of each dNTP, 0.3 [imol/L of each pri- 
mer, 3 \iL of Taq buffer, and 1 unit of Taq polymerase 
(Takara, Dalian, Liaoning, China). The PCR protocols in- 
volved initial denaturation for 4 min at 94°C followed by 
35 cycles of 40 s at 94°C, 45 s at 50°C, 90 s at 72°C, and a 
final extension step of 8 min at 72°C. The PCR products 
were purified with an E.Z.N.A. Gel Extraction Kit (Omega 
Bio-Tek, Winooski, VT, USA) and then sequenced on 
an ABI 3730 DNA Sequence Analyzer at BGI (Beijing, 
China). Sequence quality was checked against the original 
chromatogram and aligned using CLUSTAL_X version 
1.81 [45]. For ITS sequencing, the presence of "double 
peaks" at polymorphic sites in the chromatogram was 
manually checked. Haplotypes were first determined by 
"haplotype subtraction" [46,47]. 

Molecular data analysis 

Sequences for each fragment were aligned by CLUS- 
TAL_X version 1.81 [45], and indels were coded as sub- 
stitutions following the method of Caicedo & Schaal [48]. 
Haplotype diversity (h) and nucleotide diversity (jt) were 
calculated for each population (h s and jt s ) and at the 
species level (h T and tt t ) using DNASP version 5.0 [49]. 
cpDNA haplotype networks were constructed in TCS 
version 1.21 [50], which showed all linkages between 
haplotypes with >95% probability of being most parsi- 
monious. Phylogenetic analyses of cpDNA haplotype 
sequences were performed with MEGA5.2 [51] using 
the neighbor-joining method on Kimura 2-parameter 
distances [52]. Bootstrap values were estimated to assess 
the relative support for relationships between haplotypes 
(1000 replicates) [53]. These analyses included Forsythia 
viridissima as an outgroup. Determination of the diver- 
gence times of lineages within species can help elucidate 
the historic events involving the species. Considering the 
lack of reliable fossil record in genus Forsythia, the average 
divergence time was estimated via calibrating molecular 
clock implemented in MEGA 5.2 [51]. Therefore, the mini- 
mum and maximum values of a range of average mutation 
rates reported for synonymous sites of plant chloroplast 
genes [i.e. 1.2 and 1.7 x 10" 9 substitutions per site per year 
(s/s/y)] were taken [26]. 

To infer the most likely number of population genetic 
clusters (K) in the cpDNA and nrDNA dataset, we per- 
formed Bayesian analysis of population structure as imple- 
mented in STRUCTURE version 2.2 [54]. K ranged from 
1 to 10, with 10 replicates performed for each K and using 
a burn-in period of 2 x 10 5 and 5 x 10 4 Monte Carlo and 
Markov chains. The "no-admixture model" and independ- 
ent allele frequencies were chosen for this analysis. The 
most likely number of clusters was identified using the 
maximal value of L (I<) returned by STRUCTURE [54,55]. 
Genetic divergence among populations was inferred from 
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Neis estimator [56] of population substructure (G S t) as 
well as from 0 ST obtained from non-hierarchical analyses 
of molecular variance (AMOVAs) in ARLEQUIN version 
3.5 [57]. Hierarchical AMOVA was also used to quantify 
the partitioning of cpDNA and nrDNA variance between 
regional groups of populations (<£> CT ) and between popu- 
lations within such groups (<£> sc ). Significance levels of & 
statistics were based on 10 000 permutations [58]. 

For both nrDNA and cpDNA data, tests of isolation-by- 
distance (IBD) were performed by regressing values of 
0 S t against the geographical distance (K m ) and the eco- 
logical distance (least cost distance) with the Mantel per- 
mutation procedure as implemented in IBD (Isolation by 
Distance Web Service: BMC Genetics 6: 13. v.3.16 http:// 
ibdws.sdsu.edu/) [59]. According the method of Acevedo 
et al. [60], we computed the ecological distance between 
populations using the least-cost distance algorithm imple- 
mented in ArcGIS 10 [61]. This algorithm calculates a de- 
terministic least-cost distance between a source population 
and a target population using a friction layer. Here, the 
friction map was obtained as one minus the predictions of 
the past model for the LGM [60]. A pollen/seed migration 
ratio (r) was calculated using a modified equation of 
Ennos [62], according to Petit et al. [33], with AMOVA- 
derived 0 S t values (instead of G S t) taken as estimators of 
population differentiation: r = m p /m s = [(1/C^st (n) - 1) - 2 
(1/0 ST ( c) - 1)]/(1/0 ST ( c) - 1), where m p is the pollen mi- 
gration rate, m s is the seed migration rate, 0 S t (n) is the 
nuclear (nrDNA) 0 ST , and 0 S t (c) is the cytoplasmic 
(cpDNA) 0 S t- 

To determine whether the cpDNA and nrDNA sequences 
satisfied the assumption of neutrality, we calculated Tajimas 
D [63] and Fus Fs [64] values for the entire species and 
groups of populations using ARLEQUIN. Statistical 
significance of D and Fs was estimated with coalescent 
simulations as implemented in this program. Generally, a 
significantly negative Tajimas D indicates an excess of 
low-frequency alleles that can arise from purifying selec- 
tion, rapid population expansion, and selective sweeps 
[65]. Fus Fs [64] is expected to have significantly large 
and negative values under conditions of demographic 
expansion. To further infer demographic processes, we 
explicitly tested the null hypotheses of the sudden ex- 
pansion model in ARLEQUIN by comparing observed 
and expected distributions of pairwise sequence differences 
(mismatch distributions). The sum of squared deviations 
(SSD) and raggedness index (RAG) were used to test the 
goodness-of-fit of the observation mismatch distribution 
to the expectation model. 

Past and current climatic suitability inferences 

To investigate the effect of cold periods (such as during the 
LGM) on climatic suitability of F. suspensa, we inferred 
the climatically suitable areas using ecological niche 



models. Assuming that the species did not change climatic 
preference (viz, niche conservatism [66]), the climatically 
suitable areas of F. suspensa at the LGM were recon- 
structed according to the current distribution by imple- 
menting a maximum entropy model Maxent 3.1.0 [67]. 
Maxent software is considered to be more robust than 
other methods for predicting species distributions [68,69]. 
Information on the geographic distribution of F. suspensa 
was based on a set of 50 presence points covering the 
entire distribution range of F. suspensa: 30 points were ob- 
tained from the Chinese Virtual Herbarium (http://www. 
cvh.org.cn/cms/) and 20 points were from sampling sites 
(Additional file 3). Current bioclimatic variables and LGM 
data were downloaded from the WorldClim database 
(http://www.worldclim.org/). For climate layers we used 
bioclimatic variables at 2.5 arc-minute resolution [70]. 
Current conditions were based on the observed data, rep- 
resentative of 1950-2000. LGM data were simulated using 
the Community Climate System Model [71]. All of 19 bio- 
climatic variables in the WorldClim database were used to 
model climatic suitability for the species. The area selected 
to calibrate ENMs has effects on the predictive perform- 
ance of the models [72]. It should be large enough to 
account for the full response of the species to the climatic 
gradients [72,73]. Since the focal species F. suspensa is a 
species endemic to Chinas Warm Temperate Zone, a set 
of 50 presence points used in this study covers most 
distribution areas of this species. We selected an area that 
covers this zone to calibrate the model, and the predic- 
tions for both the current and LGM were also made for 
this area. 

To construct ENMs, we used the default parameters 
of MAXENT and the following user-selected features: 
regularization multiplier of 1.0, application of a random 
seed, duplicate presence records removal and cumulative 
probabilities used for the output. To test the performance 
of each model, 20% of the data in each run was randomly 
selected by MAXENT and compared with the model 
output generated with the remaining data. The area under 
the receiver operating characteristic curve (AUC) was 
used to assess model performance [67]. 

Availability of supporting data 

The data sets supporting the results of this article are 
available in the Dryad Digital Repository: http://doi.org/ 
10.5061/dryad.f5c60. 

Additional files 



Additional file 1 : Chloroplast DNA sequence polymorphisms detected 
in two intergenic spacer (IGS) regions of F. suspensa identifying 
thirteen chlorotypes (C1-C13). All sequences are relative to the reference 
haplotype C1. Numbers 1/0 in sequences denote presence/absence of 
length polymorphism, identified by superscript letter (a, b, c). 
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Additional file 2: Nuclear DNA sequence polymorphisms detected 
in internal transcribed spacer (ITS) regions of F. suspensa identifying 
seventy-four ribotypes (R1-R74). All sequences are relative to the 
reference haplotype R1. 

Additional file 3: Geographic characteristics of 50 F. suspensa 
presence points used in this study. 
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